Automatic Detection and Correction for Chinese Misspelled Words Using Phonological and Orthographic Similarities

Authors

  • Tao-Hsing Chang
  • Hsueh-Chih Chen
  • Yuen-Hsien Tseng
  • Jian-Liang Zheng
Abstract

Detecting and correcting misspelled words in documents is an important issue for Mandarin and Japanese. This paper uses the co-occurrence of phonological similarity and orthographic similarity to train a linear regression model. On the ACL-SIGHAN 2013 Bake-off dataset, experimental results indicate that, for Subtask 1, the proposed method achieves a detection F-score of 0.70 and an error location F-score of 0.43, and that its correction accuracy for Subtask 1 is 0.39.
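
As a rough illustration of the modelling idea, the sketch below fits an ordinary least-squares linear regression on two features per candidate character, a phonological similarity and an orthographic similarity, and uses the fitted score to rank candidates. The feature values, labels, candidate names, and function names are all hypothetical and are not taken from the paper; the abstract does not say how the authors compute the similarities or their co-occurrence.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_linear_regression(features: Sequence[Sequence[float]],
                          targets: Sequence[float]) -> List[float]:
    """Least-squares fit via the normal equations; returns [bias, w1, w2, ...]."""
    rows = [[1.0, *f] for f in features]  # prepend a constant term for the bias
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature: Sequence[float]) -> float:
    """Score one (phonological_sim, orthographic_sim) pair."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))


# Hypothetical training pairs: (phonological similarity, orthographic similarity),
# with target 1.0 when the candidate was the true correction and 0.0 otherwise.
train_features = [(0.9, 0.2), (0.1, 0.8), (0.8, 0.7), (0.2, 0.1), (0.3, 0.3)]
train_targets = [1.0, 1.0, 1.0, 0.0, 0.0]
weights = fit_linear_regression(train_features, train_targets)

# Rank hypothetical correction candidates by their regression score.
candidates = {"cand_a": (0.85, 0.6), "cand_b": (0.15, 0.2)}
ranked = sorted(candidates, key=lambda c: predict(weights, candidates[c]),
                reverse=True)
print(ranked)
```

In a sketch like this, a candidate that scores highly on either similarity is pushed up the ranking, and the learned weights decide how much each kind of similarity counts.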

Similar resources

字形相似別字之自動校正方法 (Automatic Correction for Graphemic Chinese Misspelled Words) [In Chinese]

Whether Chinese is learned as a first or second language, misspelled words are an important issue that needs to be addressed. Many studies have offered suggestions for correcting misspelled words for students still in school, as well as teaching and learning strategies for Chinese characters for teachers. Although in schooling, it does to prevent students who...

Cross-linguistic Analysis of Developmental Dyslexia: Does Phonology Matter in Learning to Read Chinese?

Phonological processing deficit has been ascertained to be the core cognitive deficit of developmental dyslexia—in alphabetic languages at least. Measures of phonological processing typically include three components: phonemic awareness, phonological working memory, and rapid automatic naming. Among the three tasks, phonemic awareness was the most powerful predictor of reading abilities. Becaus...

A Misspelling Intelligent Analysis Approach for Correcting Misspelled Words in English Text

This paper proposes an innovative MIA (Misspelling Intelligent Analysis) approach for efficient detection and intelligent correction of misspelled words. An integrity spelling correction approach is needed to consider both non-word errors and real-word errors. The MIA approach takes advantage of word frequency statistics, lexicon data, character distance and conditional probability for ranking ...

Automatic Activation of Phonological Information during Handwritten Production of Chinese Characters

The present study investigated whether phonological information is activated automatically and, if so, how it affects handwritten production of Chinese characters. The form preparation paradigm was adopted. In the homogeneous blocks, target characters shared the first orthographic component and the pronunciation (Experiment 1), shared the first orthographic component only (Experiment 2), or sha...

A Ranker for Semantic Spell Checking Using Context-Sensitive Features

Nowadays, a large volume of documents is generated daily. These documents are written by different people and therefore contain spelling errors, which reduce their quality. Automatic writing assistance tools such as spell checkers and correctors can help improve that quality. Context-sensitive are misspelled words that have been...

Publication date: 2013